Merging Case Relations into VSM to Improve Information Retrieval Precision
Abstract
This paper presents an approach that merges case relations into the well-known Vector Space Model (VSM), yielding a new model named C-VSM (case relation-based VSM). A Chinese case system with 23 case relations is established, and a Chinese Olympic news corpus of 7,662 sentences, denoted COCS, is constructed by manual annotation with these 23 case relations. We use 50 queries on COCS as a test set. Experimental results on the test set show that C-VSM outperforms W-VSM (word-based VSM) by 3.4% in average 11-point precision. It is worth pointing out that almost all previous studies on semantic IR obtained results no better, or even worse, than W-VSM. Our work thus validates the usefulness of case relations in IR, though the validation is still preliminary. The proposed model is believed to be language-independent.
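The evaluation measure named in the abstract, average 11-point precision, can be illustrated with a short sketch. The abstract does not say which variant the authors computed, so this assumes the standard interpolated form: precision is interpolated at recall levels 0.0, 0.1, ..., 1.0 and averaged, then averaged again over queries. The function names and the toy document ids are hypothetical and not taken from the paper.

```python
def eleven_point_average_precision(ranked_ids, relevant_ids):
    """Mean interpolated precision at recall 0.0, 0.1, ..., 1.0 for one query.

    Assumption: the standard interpolated definition, where precision at a
    recall level is the maximum precision at any recall >= that level.
    """
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")

    # Record (recall, precision) after each position in the ranking.
    points = []
    hits = 0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
        points.append((hits / len(relevant), hits / rank))

    total = 0.0
    for i in range(11):
        level = i / 10
        # Levels never reached by the ranking contribute a precision of 0.
        reachable = [p for r, p in points if r >= level]
        total += max(reachable, default=0.0)
    return total / 11


def mean_over_queries(runs):
    """Average the per-query score over (ranked_ids, relevant_ids) pairs."""
    scores = [eleven_point_average_precision(r, rel) for r, rel in runs]
    return sum(scores) / len(scores)


# Toy example with made-up ids: relevant documents appear at ranks 1 and 3.
example_score = eleven_point_average_precision(
    ["d1", "d2", "d3", "d4"], {"d1", "d3"}
)
```

Under this definition, a comparison such as the one reported for C-VSM and W-VSM would be the difference between the two models' `mean_over_queries` values on the same query set.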
Similar resources
Combining Text Vector Representations for Information Retrieval
This paper suggests a novel representation for documents that is intended to improve precision. This representation is generated by combining two central techniques: Random Indexing and Holographic Reduced Representations (HRRs). Random Indexing uses co-occurrence information among words to generate semantic context vectors that are the sum of randomly generated term identity vectors. HRRs are...
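The Random Indexing step described above can be sketched as follows. This is a minimal illustration, not that paper's implementation: the vector dimension, the number of nonzero entries, the window size, and all function names are assumptions chosen for the example, and the HRR part is not covered.

```python
import random


def make_index_vector(dim, nonzeros, rng):
    """Sparse random identity vector: a few +1/-1 entries, zeros elsewhere."""
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzeros):
        vec[pos] = rng.choice((-1, 1))
    return vec


def random_indexing(sentences, dim=64, nonzeros=4, window=2, seed=0):
    """Build a context vector per term by summing its neighbors' index vectors.

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    index = {}    # term -> fixed random identity vector
    context = {}  # term -> accumulated context vector

    for tokens in sentences:
        for tok in tokens:
            if tok not in index:
                index[tok] = make_index_vector(dim, nonzeros, rng)
                context[tok] = [0] * dim

    for tokens in sentences:
        for i, tok in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                # Add the co-occurring term's identity vector to this context.
                neighbor = index[tokens[j]]
                ctx = context[tok]
                for k in range(dim):
                    ctx[k] += neighbor[k]
    return index, context


index, context = random_indexing([["olympic", "news", "retrieval"]], window=1)
```

With a window of 1, the context vector of "news" in this toy sentence is the element-wise sum of the identity vectors of "olympic" and "retrieval".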
Topic Analysis for Psychiatric Document Retrieval
Psychiatric document retrieval attempts to help people to efficiently and effectively locate the consultation documents relevant to their depressive problems. Individuals can understand how to alleviate their symptoms according to recommendations in the relevant documents. This work proposes the use of high-level topic information extracted from consultation documents to improve the precision o...
The Phrase-Based Vector Space Model for Automatic Retrieval of Free-Text Medical Documents. Wenlei Mao, Wesley W. Chu
Objective: To develop a document indexing scheme that improves the retrieval effectiveness of free-text medical documents. Design: The phrase-based vector space model (VSM) uses multi-word phrases as indexing terms. Each phrase consists of a concept in the Unified Medical Language System (UMLS) and its corresponding component word stems. The similarity between concepts is defined by their rela...
Fusion of Retrieval Models at CLEF 2008 Ad-Hoc Persian Track
Metasearch engines submit the user query to several underlying search engines and then merge the retrieved results into a single list that better serves the users’ information needs. Following the idea behind metasearch engines, it seems that merging the results retrieved by different retrieval models will improve search coverage and precision. In this study, we have in...
Broadening Vector Space Schemes for Improving the Quality of Information Retrieval
The vector space model (VSM) of information retrieval suffers in two areas: it does not utilise term positions, and it treats every term as being independent. We examine two information retrieval methods based on the simple vector space model. The first uses the query term position flow within the documents to calculate the document score, the second includes related terms in the query by making...